Psychological Monographs:
General and Applied


Errors in Time-Study Judgments 
of Industrial Work Pace 


By 
Kalman A. Lifson 


Occupational Research Center, Purdue University




Price, $1.00 


Edited by Herbert S. Conrad 
Published by The American Psychological Association, Inc. 


Vol. 67, No. 5
Whole No. 355, 1953


Psychological Monographs:
General and Applied

Combining the Applied Psychology Monographs and the Archives of Psychology
with the Psychological Monographs

Editor
Herbert S. Conrad
Federal Security Agency
Office of Education
Washington 25, D.C.




Copyright, 1953, by the American Psychological Association, Inc.


Vol. 67, No. 5 


Whole No. 355, 1953 


Psychological Monographs: General and Applied 


Errors in Time-Study Judgments of Industrial Work Pace¹


Kalman A. Lifson 
Occupational Research Center, Purdue University 


I. INTRODUCTION 


Time studies are made to determine 
how much time should be required to 
accomplish specified elements of work. 
One of the steps in a time study is the 
calculation of the “normal time”—the 
time which the work element requires 
when performed at a “normal” pace. In 
determining the normal time, the time- 
study man records the amount of time 
actually used by the operator he is ob- 
serving, and he also makes a judgment 
of the work pace of the operator. The 
actual observed time is adjusted in accord 
with the judgment of work pace so that 
the adjusted time is representative of a 
“normal” performance. 

Usually the judgment of work pace is 
expressed as a percentage of “normal” 
work pace, and the adjustment is the 
multiplication of the observed time by 
this percentage. If, for example, the time- 
study man believes the observed worker 
is maintaining a pace that is 30 per cent
faster than normal, he rates the per- 
formance for the observed work element 
at 130 (omitting the decimal) and multi- 
plies the observed work time by 1.3. 
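The adjustment described above can be sketched in a few lines. This is only an illustration: the function name and the observed time of 0.50 minute are hypothetical, and the rating is assumed to be expressed in points with 100 as normal.

```python
def normal_time(observed_time: float, rating_points: float) -> float:
    """Adjust an observed time by a pace rating in points (100 = normal)."""
    return observed_time * rating_points / 100.0


# A worker judged to be 30 per cent faster than normal is rated 130,
# so the observed time is multiplied by 1.3.
observed_minutes = 0.50  # hypothetical stop-watch reading
print(normal_time(observed_minutes, 130))
```

A rating above 100 lengthens the normal time relative to the observed time, since a normal-pace operator is assumed to need more time than the fast operator who was observed.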

The judgment is called by various 
names such as “leveling,” “effort rating,” 
“performance rating,” “pace rating,” or 
just plain “rating.” In this paper the term
“rating” will be employed. This rating
may be defined as the process of com-
paring an observed work pace to some
concept of a “normal” pace. Time studies
are only as reliable and accurate as the
ratings on which they are based.

¹ This paper is based upon a thesis (5) sub-
mitted to the Graduate School of Purdue Uni-
versity in partial fulfillment of the requirements
for the degree of Doctor of Philosophy, August,
1951. The research was done under the commit-
tee of Professors E. J. McCormick, chairman,
Joseph Tiffin, N. C. Kephart, W. V. Owen, and
M. E. Mundel.


Purpose of Investigation 


This study examined the nature of the 
errors involved in ratings of work pace 
from several aspects: (a) by determining 
the magnitude and assignable causes of 
errors in pace ratings; (b) by determining 
the variation among the consistencies of 
ratings from different sources; and (c) 
by comparing the relative concepts of 
“normal pace” of workers with the con- 
cepts of the time-study men. 

Sources of error. Factors which are 
the assignable sources of error in pace 
ratings are evidenced by constant dif- 
ferences in the average level of those 
ratings which are common to the factor. 
Specifically: (a) different observers may 
have different general concepts of normal 
work pace, and tend to rate either con- 
sistently high or consistently low; (b) 
different jobs may appear easier to some 
observers than to others, and this varia- 
tion may be superimposed on the general 
high- or low-rating tendency of the ob- 
server; (c) different workers, maintaining 
the same tempo of elements per minute 
on a job, may also be rated differently; 
and (d) the normal times derived may 
differ when different levels of pace are 
observed—lower paces may be rated too 
high, and higher paces may be rated too 
low. 




One of the purposes of this study was 
to determine the magnitude of the error 
due to each of these four factors and to 
their interactions. 

Differences in consistency. Besides the 
differences which may cause some raters 
to rate higher than others, there may be 
differences which cause some raters to 
be more consistent than others. Likewise, 
some jobs may be rated more consistently 
than others, and some workers and some 
paces may be rated more consistently 
than others. 

Consistency is probably more impor- 
tant than the level of rating, since, if a 
rater is consistently too high, he can be 
taught to lower his ratings. But, if a 
rater is not consistent, improvement is 
not as easy. 

A second purpose of this study, then, 
was to determine if there are real dif- 
ferences among the consistencies of rat- 
ings by different raters, on different jobs, 
on different workers, and on different 
paces. 

Differences between judgments of 
raters and workers. Workers who are the 
subjects of time studies, and whose work 
quota is based on time study, form opin- 
ions about the paces they are expected 
to maintain. Workers may not equate the 
jobs in the same way as the raters do. 
To the workers, the raters’ idea of nor-
mal pace on one job may be more diffi-
cult to maintain than the raters’ idea of
normal pace on another job. Also, what 
may be a relatively easy normal pace to 
one worker may be a relatively difficult 
normal pace to another worker. If this 
be true, it is probably the cause of 
many grievances about the time-study 
system.

A third purpose of this study was to 
compare the workers’ judgments with the 
observers’ pace ratings. 




II. PROCEDURE


The general procedure involved the 
rating, by expert time-study men, of 
movies of performances of previously 
established paces. Each of five “workers” 
performed each of four jobs at each of 
five paces, and movies were taken of these 
performances. The workers made an eval- 
uation of each pace on each job. Six 
time-study men rated twice each of the 
performances shown on the film; there 
was an interval of a month between 
rating sessions. 


Workers 


The workers were five Purdue students 
(male) who had been employed at fac- 
tory jobs where they were either on a 
piece rate or quota system. They had
had no training in time study. They
were accustomed to making the kind of 
judgments about paces that workers usu- 
ally make. 


Jobs Performed 


The four jobs were elements of factory 
tasks. They were simple enough to be
learned quickly and to be done in rhythm 
with a metronome. The jobs are de- 
scribed as follows: 

1. Twisting screws. Twisting a %{4-inch screw 
in and out of a threaded hole. The screw was 
twisted half a turn each beat of the metronome. 

2. Stamping. Stamping boxes with a rubber 
stamp. The boxes were 3 inches square. They
were set in a carton which was 6 inches deep. 
The boxes were 12 inches from the stamp pad. 
The box was stamped every other beat and the 
stamp pad was stamped every other beat. 

3. Twisting bolts. Alternately twisting two 14-
inch hexagonal bolt heads placed 30 inches
apart, requiring movement of the arm about 
the shoulder. One bolt was twisted each beat. 

4. Packing. Packing 36 3 × 3 × 2-inch boxes
of half-bearings into a carton. The boxes were
presented in 6 piles, grasped one at a time with
one hand, transferred to the other hand, and
placed in the carton in three 3 × 4 layers. When
packing, one hand grasped a box and the other
hand positioned a box in the carton on the
first beat, and the hand-to-hand transfer was
done on the second beat.

All workers were apparently able to 
learn to do these jobs easily in two hours, 
and after that time apparently they were 
not bothered by the imposed metronome 
rhythm. The workers practiced the jobs 
for six hours before they made their 
judgments, and the films were taken after 
that. There was no apparent lack of skill
on these performances. 


Paces Maintained 


The normal pace for each of the four 
jobs was determined by a preliminary 
study. For this preliminary study, movies 
were made of several workers doing the 
four jobs, and the same six experts used 
in the actual study rated these prelimi- 
nary films. Estimates of the tempo that 
represented normal pace for a job were 
obtained by dividing the actual tempo 
(the beats per minute of the performance 
observed) by the rating for that tempo. 
These estimates were then averaged for
each job to obtain the tempo which the
six experts considered the average normal
pace.

Each worker performed each job at the
tempo corresponding to normal pace,
and at 90, 110, 120, and 130 per cent of
that tempo. The metronome tempo for
each pace of each job is given in Table 1.

TABLE 1
METRONOME SETTINGS FOR PACES AND JOBS

Pace     Job 1     Job 2     Job 3     Job 4
 90      142       112½      ...       ...
100      158       125       125       73
110      174       137½      137½      80½
120      189½      150       150       88
130      205½      162½      162½      95
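The two steps above (estimating the normal tempo from preliminary ratings, then scaling it to each pace) can be sketched as follows. The function names and the preliminary observations are hypothetical. The normal tempo of 158 beats per minute is the Job 1 setting from Table 1. The products are left unrounded, since the text does not state how the metronome settings were rounded.

```python
from statistics import mean


def estimate_normal_tempo(observations):
    """Average tempo / rating over (beats_per_minute, rating_points) pairs."""
    return mean(tempo / (rating / 100.0) for tempo, rating in observations)


def pace_tempos(normal_tempo, paces=(90, 100, 110, 120, 130)):
    """Tempo in beats per minute for each pace, as a percentage of normal."""
    return {pace: normal_tempo * pace / 100.0 for pace in paces}


# Hypothetical preliminary ratings: (observed tempo, rating in points).
prelim = [(150, 95), (160, 100), (170, 108)]
normal = estimate_normal_tempo(prelim)
print(round(normal, 1))

# Scaling the Job 1 normal tempo from Table 1 to the five paces.
settings = pace_tempos(158)
for pace, tempo in settings.items():
    print(pace, round(tempo, 1))
```

Each scaled value falls within half a beat of the corresponding Job 1 entry in Table 1.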


Raters 


The expert time-study raters were six 
men who all are making, or have made, 


time studies in industry. Two were pro-
fessors on the industrial engineering staff
at Purdue. Two were time-study men for 
a large manufacturing firm, and two were 
men with several years of experience in 
time study who had returned to school 
for advanced degrees. 


Film 
Movies were taken of each worker on 
each job at each pace. The film was cut
and reassembled in a sequence such that 


there was no order which the raters could 
detect. 


Each job was presented once in the first four 
shots, once in the second four, and so on. Each 
worker was presented once in the first five shots, 
once in the second five, and so on. Each pace 
was presented once in the first five shots, once 
in the second five, and so on. The order of 
presentation of jobs, workers, and paces within 
the groups of four or five was never repeated 
in the next group. No two shots of the same 
job were ever in sequence, and only once was 
the same worker shown twice consecutively. 

Each shot was about 24 seconds of running 
time. A blank leader, five seconds long, was 
spliced between each shot. The raters were to 
write down their ratings while this leader ran 
through the projector. All of the raters thought 
that the shots were long enough to rate. The
film was put on three reels. Each reel ran about
16 minutes. 

When the pictures were taken, the camera was 
operated by a synchronous motor drive at 1,000 
frames per minute. The projector was run at
exactly this speed. A strobotac was used to de- 
termine the projector speed. 


Pace Ratings 


Generally, ratings are given in
“points,” analogous to percentage points.
An apparently normal pace is rated as
100 points. A pace 10 per cent slower
than normal is rated as 90 points. The
measurements of error used in this study 
are expressed as the number of points 
in the errors of the ratings. 

Two rating sessions were held about 
a month apart, and all six raters rated 
on each session. In the second session the
order of presentation of the three reels
of the film was reversed. A rest pause of
about 10 minutes was taken by the raters
between reels.


Method of Obtaining Worker Judgments 


In order to be able to compare the 
time-study men’s ratings of work paces 
with the workers’ judgments of these 
paces, it was necessary to quantify the 
workers’ judgments. These judgments 
represented the workers’ relative evalua- 
tions of the paces on the four jobs. The 
quantification was done by having the 
worker compare, in dollars and cents, his 
feelings about the four jobs. The worker
performed job 2 (stamping) at the normal 
pace, and was told that by working at 
that rate he could earn $1.00 an hour. 
Then he was given another job to per- 
form at one of the five paces. He was 
asked, “If stamping like that you earned 
$1.00 an hour, how much should you 
be earning now?” After answering, he 
went back to the normal stamping, and 
then to another job pace to make the 
next comparison. This process was re- 
peated until he had compared each job 
at each pace with the standard of stamp- 
ing at normal pace. 

Next, a different job, at a pace that 
the worker had equated to the normal 
stamping, by answering “$1.00 an hour” 
to the question of earnings, was chosen 
as the standard. The judgment process 
was repeated, this time using the new 
standard for comparison. 

Finally, a third standard pace, on a 
third job, was chosen and the process was 
run again. 

Thus, for each pace on each job, the
worker made three judgments. The work-
ers were told nothing about the relation-
ship of the paces, or that they were being
given the same paces over again. The
job paces were presented in a random
order.


Statistical Techniques 


Analyses of variance (2, pp. 176-190) 
were made for the 1,200 time-study pace 
ratings obtained and for the 300 worker 
judgments. The interpretation of confi- 
dence levels was conservative because of 
the nonhomogeneous variances present. 
An analysis of covariance (2, pp. 333-355) 
was made between worker judgments and 
time-study pace ratings on each pace of 
each job. 
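The counts of 1,200 ratings and 300 judgments follow from the factorial design described in the Procedure. A minimal sketch that enumerates the cells (the variable names are illustrative only):

```python
from itertools import product

WORKERS = range(1, 6)            # five workers
JOBS = range(1, 5)               # four jobs
PACES = (90, 100, 110, 120, 130)
RATERS = range(1, 7)             # six time-study men
SESSIONS = (1, 2)                # two rating sessions a month apart
STANDARDS = (1, 2, 3)            # three comparison standards per worker

# One pace rating per worker x job x pace x rater x session.
rating_cells = list(product(WORKERS, JOBS, PACES, RATERS, SESSIONS))

# One worker judgment per worker x job x pace x standard.
judgment_cells = list(product(WORKERS, JOBS, PACES, STANDARDS))

print(len(rating_cells), len(judgment_cells))
```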

Reliability of a category of ratings, and 
of worker judgments, was calculated as 
the standard deviation of the errors of 
the ratings in that category, and is ex- 
pressed as the standard error of the 
ratings. Significances of differences among
standard errors were tested by Bartlett's
test (2, pp. 195-197).

III. RESULTS


Reasonably conclusive answers to the 
questions proposed in the introduction 
to this study are provided by the data.
The interpretation of this summary of
results is modified by the following 
limitations of the study: the time-study 
men, although they professed the same 
verbal concept of normal pace, were not 
from the same plant; in addition, the 
time-study men may have experienced 
fatigue from so many ratings; further, the 
rating on any one shot may have been 
affected by the preceding shots; the jobs 
themselves were somewhat simpler than
typical jobs; and the paces observed did 
not include some of the extreme values 
occasionally seen in actual work. Some of 
these factors would operate to reduce 
error, others to increase it. 

TABLE 2
CONDENSED ANALYSIS OF VARIANCE OF PACE RATINGS

                        Sum of     Degrees of
Source                  Squares    Freedom      Variance    F*
Jobs                       368         3        123          48
Paces                      312         4         78          31
Workers                  1,456         4        306         142
Raters                   4,995         5        999         381
Job × Worker               195        12         16.2         6.4
Job × Rater                559        15         37.3        15
Job × Order                 36         3         12           4.7
Pace × Worker               93        16          5.8         2.3
Pace × Rater               562        20         28.1        11
Worker × Rater             155        20          7.8         3.0
Job × Rater × Order        152        15         10.1         4.0
Residual                 2,270       889          2.55
Total                   12,151     1,199

* All significant at .01 level.

In order to develop the various phases
of the results, all of the ratings were
converted into “errors” by subtracting
from the rating the “correct” rating for
that tempo. The correct rating was the
ratio of the observed tempo to the nor-
mal tempo.

To facilitate handling the data, the
errors were coded by dividing them by
five. The analysis-of-variance tables are
in terms of this coded error.
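The conversion of a rating into a coded error can be sketched as below. The function names and the example tempos are hypothetical, and the ratio of tempos is assumed to be expressed in points (multiplied by 100) so that it is comparable with the ratings.

```python
def correct_rating(observed_tempo: float, normal_tempo: float) -> float:
    """Correct rating in points: observed tempo as a percentage of normal."""
    return 100.0 * observed_tempo / normal_tempo


def coded_error(rating: float, observed_tempo: float, normal_tempo: float) -> float:
    """Rating minus correct rating, divided by five (the coding unit)."""
    return (rating - correct_rating(observed_tempo, normal_tempo)) / 5.0


# Hypothetical case: normal tempo 125, observed tempo 150 (a pace of 120),
# rated at 110 points. The error is -10 points, or -2 in coded units.
print(correct_rating(150, 125))
print(coded_error(110, 150, 125))
```

A negative coded error means the pace was underrated. Multiplying a coded value by five returns it to rating points.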


Differences in Level of Rating 


To measure the significance of con- 
tributions to the total variance, the 
mean error for each rater, worker, job, 
pace, order of showing, and interactions 
was calculated. The differences among 
means were tested by an analysis of vari-
ance. The factors which were significantly
different at better than the .01 level
of confidence are shown in Table 2.
Order refers to the first or second rating 
session. 

Differences in mean ratings of raters, 
workers, paces. Three factors are out- 
standing in their contribution to total 
error: differences among raters, differ- 
ences among workers, and differences 
among paces. The size of the variation 


among mean values of these groups is 
shown in Fig. 1. 

Over one-third of the total variance 
in Table 2 is caused by rater-to-rater 
differences. These differences are very 
practically significant. For example, on a 
standard set by rater 6, a worker would 
have to perform at a pace almost 30 per
cent greater than on a standard set by 
rater 2. These six raters were not all from 
the same plant, but they were all using 
the same verbal concept of a normal 
pace. 

The differences in normal times which 
would result from time studies made 
on different workers are also of very 
practical significance. Worker 2, for ex- 
ample, stands out on Fig. 1 as being 
rated between 10 and 15 points higher 
than the others. No reason is advanced 
for this in terms of his physique. A nor- 
mal time based on his performance 
would be 15 per cent slower than one 
based on worker 4. 

The trend to underrate high paces and
overrate low paces is also marked, and
shows clearly on Fig. 1. Probably this
tendency becomes even more exaggerated
at more extreme paces, and forms some
basis for workers’ preferences to have
time studies taken on slow operators. A
standard set on an operator working at
a pace of 90 would be 7 per cent “looser”
than a standard set on an operator work-
ing at a pace of 130.

[Fig. 1. Magnitude of Variation among Different Sources. Vertical axis: points from mean; panels: raters, workers, paces.]

Differences in mean rating of jobs. 
Had the ratings not changed from the 
preliminary film to the actual study, no 
job differences should have occurred. 
Normal pace for each job was taken 
as the average of the raters’ concepts of 
normal pace for that job on the prelimi-
nary film. The significant difference
among job means shown in Table 2 is a
result of the ratings for job 2 (stamping) 
being raised about 5 points from the pre- 
liminary to the actual study. In the pre- 
liminary study the raters saw the stamp- 
ing from in front of the worker. They 
saw his hand moving at them. In the 
actual study the raters saw the worker 
from the side. This may account for the 
change, and also indicates that the angle 
of observation could be a topic for 
further investigation. 

Interactions of practical significance
among pace ratings. The significant job
× worker interaction in Table 2 reveals
a pattern superimposed upon the tend- 
ency for some workers to be rated higher 
than others and the tendency for jobs to 
be rated differently. When the mean 
worker ratings are equated and the mean 
job ratings are equated, there remains a 
tendency for some workers to be rated 
higher on some jobs than on others. The 
size of the standard deviation due to this 
interaction is about two rating points. 

The significant job × rater and worker
× rater interactions of Table 2 reveal a
pattern superimposed upon the tendency 
for some raters to rate higher than 
others and the tendencies for jobs and 
workers to be rated differently. There re- 
main tendencies for some raters to rate 
some jobs higher than others and to rate 
some workers higher than others. The 
sizes of the standard deviations due to 
these interactions are about 4 and 2 rat- 
ing points, respectively. 

Pace × rater interaction of ratings. The
pace × rater interaction in Table 2 indi-
cates that not all raters underrate the
high paces and overrate the low paces. 




Figure 2 shows that raters 2, 5, and 6
exhibit the tendency markedly, while
rater 4 does so only slightly, and raters
1 and 3 not at all. The size of the stand-
ard deviation due to this interaction is
about 5 rating points.

[Fig. 2. Errors of Each Rater on Each Pace. Horizontal axis: pace, 90 to 130.]

Other interactions of pace ratings. The
pace × worker interaction in Table 2,
which indicates a differential rating of
the various workers at different paces, is
statistically significant, but the size of
the variations is small. The interactions
involving order are caused by some of
the raters changing their concepts of
normal on certain jobs. Although sta-
tistically significant, the size of the varia-
tions is of no practical meaning.


Magnitude of the Standard Errors 
of Pace Ratings 


The standard error of all the ratings
is 16 percentage points. This indicates
that a normal time obtained from these
ratings stands one chance in three of
being wrong by ±16 points or greater.

To find the standard error representa-
tive of the consistency within a category
of ratings, the variance among categories
is subtracted from the total variance.
The residual standard error, with all
known sources of error eliminated, is
7.9 points. This represents the standard
error present in any one rater’s single
rating.

TABLE 3
STANDARD ERRORS FOR VARIOUS POSSIBLE CIRCUMSTANCES

                                                     Percentage
Item                                                 Points
A. Gross standard error                              16.0
B. One job, several raters rating different
   workers at different paces                        15.9
C. One worker on one job, rated by several
   raters at different paces                         15.8
D. One worker on one job at one pace,
   rated by several raters                           ...
E. One rater, rating several workers at
   different jobs at different paces                 ...
F. One rater on one job, rating several
   workers at several paces                          10.7
G. One rater rating one worker, on several
   jobs at different paces                           ...
H. One rater rating one worker on one job,
   at different paces                                ...
I. One rater rating a worker on a job
   at one pace                                        7.9

Standard errors for various possible
circumstances are shown in Table 3. The
errors (except for the total) are measures
of reliability rather than accuracy. For
example, circumstance F might occur
when a time-study man attempted to
check a standard for a given job by tim-
ing several workers. The standard error
of the normal times about the average
of this rater would be 10.7 percentage
points, but this rater’s average might
differ considerably from the average
which other time-study men would get.

Circumstance B might occur in a com-
pany having the same job in several
plants. The standard error of the normal
times about the average would be 15.9 
percentage points. Circumstances C, D, 
and H could occur in checking standards 
by various methods. Circumstances E 
and G might occur in a situation de- 
signed to test the reliability of raters. 
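As a rough check on the gross and residual standard errors, the Total and Residual lines of the analysis of variance (sums of squares 12,151 and 2,270 on 1,199 and 889 degrees of freedom, in coded units) can be converted back to rating points. The sketch below assumes that a standard error in points is five times the square root of the coded variance, and that the errors are roughly normally distributed; neither assumption is stated in this form in the text.

```python
import math

CODE_UNIT = 5.0  # errors were coded by dividing them by five


def standard_error_points(sum_of_squares: float, degrees_of_freedom: int) -> float:
    """Standard error in rating points from a coded sum of squares."""
    return CODE_UNIT * math.sqrt(sum_of_squares / degrees_of_freedom)


def chance_beyond(k: float) -> float:
    """Probability that a normal deviate lies more than k standard errors out."""
    return 1.0 - math.erf(k / math.sqrt(2.0))


total_se = standard_error_points(12151, 1199)
residual_se = standard_error_points(2270, 889)

print(round(total_se, 1), round(residual_se, 1))
print(round(chance_beyond(1.0), 2))
```

Under these assumptions the total comes out near 16 points and the residual near 8 points, in line with the 16.0 and 7.9 reported. The chance of an error larger than one standard error is a little under one in three.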


Comparison of Reliability of
Pace Ratings

The results of the study have been dis-
cussed in terms of errors due to constant
differences in levels of ratings and in
terms of average standard errors. In ad-
dition, real differences in the consist-
encies of ratings within categories do
exist. As Table 4 shows, some raters are
more consistent than others, some jobs
can be rated more reliably than others,
some workers can be rated more reliably
than others, and some paces can be rated
more reliably than others.

TABLE 4
STANDARD ERRORS OF PACE RATINGS

A. Standard Error in Points for Each Rater

Rater     Standard Error
1         13.2
2         11.8
3         16.2
4         10.6
5         10.4
6         10.1

B. Standard Error in Points of Ratings on Each Worker

Worker    Standard Error
1         14.7
2         17.0
3         14.2
4         14.3
5         15.0

C. Standard Error in Points of Ratings for Each Job

Job       Standard Error
1         12.8
2         17.9
3         15.3
4         16.6

Differences in reliability of pace rat-
ings of raters, workers, and jobs. Table 4
lists the standard error of ratings for each
rater, worker, and job. The differences
shown are of practical significance.
Raters 4, 5, and 6 are much better than
rater 3, as indicated in Table 4A. Table
4B shows that a standard obtained on
worker 3 or 4, for example, would be
more reliable than one obtained on
worker 2. The standard for job 2 would
have to be checked more closely than the
standard for job 1, according to Table
4C.

Differences in reliability of different
paces. Figure 3 shows how reliability
varies with pace. Normal pace is rated
most reliably. The increase in standard
error seems steeper at lower paces than
higher. The standard error seems to in-
crease proportionally to the increase in
pace, about 0.2 per cent every 10 points
of pace. While not of great practical sig-
nificance within the range of paces
studied, the trend has importance if ex-
tended to extreme paces.

[Fig. 3. Reliability of Ratings at Various Paces. Axes: standard error against pace, 90 to 130.]


Worker Judgments

The pace judgments of each worker
on each job on each pace were subjected
to an analysis of variance. As with the
pace ratings, these judgments were
coded by subtracting the “correct” figure
from them, and dividing this by five.
The condensed analysis of variance, list-
ing the factors which were significant
at the .01 level of confidence, is pre-
sented in Table 5.

TABLE 5
CONDENSED ANALYSIS OF VARIANCE OF WORKER JUDGMENTS

                  Sum of
Source            Squares    D.F.    Variance    F*
Paces                 70       4      17.5        5.37
Jobs                 142       3      47.4       22.4
Worker × Job       1,165      16      72.8       14.5
Residual             900     276       3.26
Total              2,277

* All significant at .001 level.

Worker judgments on jobs. Theoreti-
cally, normal times on different work
elements or jobs should represent equally
difficult levels of performance. The con-
cept of “a fair day's work” should be con-
stant from job to job to both the time-
study man and the worker. Other things
being equal, a worker paid on an incen-
tive plan should be able to earn as much
on one job as on any other.

The results show that the workers do
not equate the four jobs as the time-study
men do. For example, the workers be-
lieve that “normal” on packing (job 4)
is worth 12.7 cents an hour more than
“normal” on stamping (job 2). Table 6
shows the differential, in points, between
the workers’ judgments and the pace
ratings. Because the workers were given
a concept of normal, this table does not
indicate that the workers’ concept of
normal, averaged over the jobs, agrees
with the raters’ concept. It shows that a
differential exists to the workers where
none exists to the raters.

TABLE 6
DIFFERENCE BETWEEN WORKER JUDGMENTS AND PACE RATINGS

          Worker Judgments Minus
Job       Pace Ratings, in Points
1         -3.2
2         -6.0
3         +2.5
4         +6.7
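The 12.7-cent figure can be traced to the Table 6 entries for jobs 4 and 2. The short sketch below assumes that one point corresponds to one cent at the $1.00-an-hour standard used in the worker judgments; the function name is hypothetical.

```python
# Worker judgments minus pace ratings, in points, by job (Table 6).
TABLE_6 = {1: -3.2, 2: -6.0, 3: 2.5, 4: 6.7}


def differential_cents(job_a: int, job_b: int, base_rate_cents: float = 100.0) -> float:
    """Hourly differential between two jobs, in cents, at the base rate."""
    return (TABLE_6[job_a] - TABLE_6[job_b]) * base_rate_cents / 100.0


# Packing (job 4) against stamping (job 2).
print(round(differential_cents(4, 2), 1))
```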

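Worker judgments of different paces.
The workers exhibit the same tendency
as the raters to overrate low paces and
underrate high paces. Table 7 lists the
mean errors on each pace.

TABLE 7
MEAN ERRORS OF WORKER JUDGMENTS ON EACH PACE

Pace:               90, 100, 110, 120, 130
Error in Points:    +2.5, +1.9, -2.9, -3.90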

Worker × job interaction of worker
judgments. Not all workers judge the
jobs in the same pattern. When the mean
worker judgments and the mean job
judgments are equated, there remains a
tendency for some workers to judge some
jobs higher than others. Table 8 shows
the relative judgment made by each
worker on each job. The numbers are
the points from the average which the
worker assigned the job, minus the sum
of the points from the average of the job
and the worker. The size of these inter-
actions is of practical significance, par-
ticularly workers 1 and 4 on jobs 1 and 3.
Worker 1 thinks that job 3 is worth
much more than job 1, while worker 4
has the reverse opinion. On job 1, worker
1 would be satisfied with a standard time
about 30 per cent less than the standard
time worker 4 would require. On job 3,
worker 1 would look for a standard time
about 40 per cent greater than the stand-
ard time with which worker 4 would be
satisfied.

TABLE 8
WORKER × JOB INTERACTION OF WORKER JUDGMENTS
(Points Deviation from the Average for the Worker and the Job)

                        Job
Worker       1        2        3        4
1          -11.8     -6.6    +17.0     ...
2           -0.7     +3.4     -2.3     -0.4
3           -1.8     +1.4     -2.0     +2.3
4          +17.8     +4.0    -21.4     -0.5
5           -3.5     -2.3     +8.6     -2.8
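The 30 and 40 per cent figures can be checked against the Table 8 deviations for workers 1 and 4. The sketch below also uses a general property of two-way interaction deviations, namely that they sum to zero along each row and each column, to show what value the one unreadable cell (worker 1, job 4) would have to take. That property is an assumption applied here for illustration; the implied value is not a figure taken from the table.

```python
# Interaction deviations in points; None marks the unreadable cell.
TABLE_8 = {
    1: [-11.8, -6.6, 17.0, None],
    2: [-0.7, 3.4, -2.3, -0.4],
    3: [-1.8, 1.4, -2.0, 2.3],
    4: [17.8, 4.0, -21.4, -0.5],
    5: [-3.5, -2.3, 8.6, -2.8],
}


def gap(worker_a: int, worker_b: int, job: int) -> float:
    """Difference in points between two workers' deviations on one job."""
    return TABLE_8[worker_a][job - 1] - TABLE_8[worker_b][job - 1]


print(round(gap(4, 1, 1), 1))  # job 1: worker 4 against worker 1
print(round(gap(1, 4, 3), 1))  # job 3: worker 1 against worker 4

# Value implied for worker 1 on job 4 if deviations sum to zero.
implied_from_row = -sum(v for v in TABLE_8[1] if v is not None)
implied_from_column = -sum(TABLE_8[w][3] for w in (2, 3, 4, 5))
print(round(implied_from_row, 1), round(implied_from_column, 1))
```

The row and the column give the same implied value, which suggests the readable entries are internally consistent.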

Standard error of the worker judg- 
ments. The standard error of the worker 
judgments is comparable to the standard 
error of the pace ratings. The total stand- 
ard error is 13.8 points, compared to 16.0 
for the pace ratings. The residual stand- 
ard error is 9.0 points, compared to 7.9 
for the pace ratings. 

Reliability of worker judgments. Real
differences in reliability of judgments
exist. Some workers are more consistent
than others, and some jobs are judged
more reliably than others.

Table 9 lists the standard error of the
judgments of each worker, and for each
job. The differences shown are of prac-
tical significance. Some of the workers
are far more reliable than others, as indi-
cated in Table 9A. The standard error
of workers 2 and 3 is half that of the
rest, and compares favorably with the
standard error of the time-study men.
There is no apparent relation between
the worker's being consistent in his judg-
ments and the reliability with which he
is rated.

The workers do not agree as well on
some jobs as on others. The jobs in
Table 9B on which the worker judg-
ments are relatively unreliable are not
the same jobs on which the pace ratings
are relatively unreliable.

TABLE 9
STANDARD ERRORS OF WORKER JUDGMENTS

A. Standard Error in Points for Each Worker

Worker    Standard Error
1         ...
2         ...
3         ...
4         ...
5         ...

B. Standard Error in Points for Each Job

Job       Standard Error
1         15.0
2          8.3
3         18.3
4          9.7

The difference in variance among paces 
is not statistically significant. The stand- 
ard errors of the judgments seem to in- 
crease as pace increases; this was true 
also of the standard errors of the pace 
ratings. 

Correlation between pace ratings and 
worker judgments. An analysis of the co- 
variance between ratings of the five work- 
ers on the four jobs and the judgments 
of the workers about the jobs reveals a 
correlation within the workers and jobs 
of + .46. The analysis of covariance sta- 
tistically equates the mean job ratings 
and worker judgments. The positive cor- 
relation indicates that, to the extent in- 
dicated by the correlation of .46, a 
worker will be rated relatively high on 
a job which he feels should be rated rela- 
tively high. 


IV. PREVIOUS STUDIES 
OF RATING 

Several experimental investigations of 
time-study rating have been published 
and are pertinent to the present study. 
Two have shown that filmed perform-
ances are rated the same as live per-
formances. Other studies have attempted 
to measure the consistency of ratings of 
work pace, but in general they do not 
consider the effects of different raters, 
workers, jobs, paces, and interactions. 
Some investigators have shown that 
rating ability is improved by training, 
and others have been concerned with 
predicting rating ability. One group of 
researchers has been interested in the 
development of aids for rating, in the 
form of films of known paces. 


A. Standard Error in Points for Each Worker 
[worker values not recoverable] 

B. Standard Error in Points for Each Job 

Job    Standard Error 
1      15.0 
2      8.3 
3      18.3 
4      9.7 




Use of Film for Rating 


The results of the present study would be 
invalid if performances on movie film are rated 
differently from live performances. Mundel and 
Margolin (8) and Barnes (1) have shown that 
ratings from movies are as consistent and accu- 
rate as ratings of actual workers. 


Measurements of Consistency 


Although few experimenters report the stand- 
ard errors of the ratings they have obtained, 
their data can be manipulated to find the stand- 
ard errors. Thirteen points represents the mean 
and the mode of all of the standard errors of 
ratings of work paces found in the studies cited 
in the References section. This is compatible 
with the findings of this study, that the total 
standard error is 16 points, because the previous 
studies have measured the reliabilities of single 
raters, or ratings on single jobs. None of them 
has had ratings on more than one worker on 
a job. All but one of the reported standard 
errors are within the range from 8 points for 
residual to 16 points for total standard error re- 
ported in the present study. 


Training Raters 


Several investigations have been made as to 
whether rating can be improved by training and 
experience. One of Mundel’s students (7) found 
that raters with over six months’ experience had 
a standard error which was % the size of that 
of raters with less experience. 

Barnes (1) cites three studies of the rating of 
walking, wherein the standard error was re- 
duced from 16 points to 12 points following a 
two-day training session. During a six-month 
training period, in which a group of raters was 
trained to rate the specific job on which they 
were tested, standard error went as low as 3 
points. The raters could not be expected to main- 
tain this level of consistency on other jobs. 

Lifson (4), with time-study students rating 
some films of factory jobs, reduced the stand- 
ard error from 11 to 9 points in an eleven-week 
training period. He also noted that the raters 
who were originally most consistent improved 
the most. Another finding was that the con- 
cept of normal pace can be shifted very easily. 
He changed the level of rating by demonstrating 
the level of activity which was to be considered 
as normal, and the ratings shifted as the dem- 
onstrated level shifted. 


Predicting Rating Ability 


Lifson (4) found that ability to rate after 
training could be predicted from ability to rate 
before training. The correlation between a con- 
sistency measure before and after was +.45. He 
observed that ability to rate can be predicted 
even more successfully after the raters have prac- 
ticed for about two weeks. Holder (3) found no 
correlation between rating ability and scores in 
the Purdue Adaptability Test, the Purdue Physi- 
cal Science Test, the Purdue Mathematics Train- 
ing Test, the Purdue Placement Test in Eng- 
lish, A.C.E., or motion- and time-study tests. 


Aids for Rating 


In an attempt to objectify the rating pro- 
cedure, Dr. M. E. Mundel has worked with a 
system of rating that involves the use of aids 
in the form of movies of established levels of 
pace. The rater compares the pace in question 
to the established standard. He attempts to com- 
pare acceleration alone, and later adjusts for 
job difficulties. A paper by Mundel and his 
students (7) reports that by using a single 
standard film, representing a normal pace, the 
standard error (with no job or worker vari- 
ance) was reduced from the “unaided” value 
of 14 points to 13 points. Use of a series of 
twelve established paces, ranging from 80 to 155, 
further reduced the standard error to 12 points. 


V. IMPLICATIONS 


From the pace ratings of experienced 
time-study men, it was determined that 
the magnitude of the standard error of 
these pace ratings is 16 percentage points, 
and that much of this error is due to dif- 
ferences among raters, workers, jobs, 
paces, and their interactions. 

The residual standard error is 7.9 per- 
centage points. It has also been shown 
that there are real differences in the re- 
liability of ratings from different sources, 
that what are two equal paces to time- 
study men may not feel equal to the 
workers performing at the paces. 
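
The relation between the total and residual standard errors quoted above can be made concrete with a short calculation. This is only a sketch: it assumes that the error components are independent and therefore add in variance, and the variable names are illustrative.

```python
import math

total_se = 16.0     # total standard error reported in the text, in points
residual_se = 7.9   # residual standard error reported in the text, in points

total_var = total_se ** 2
residual_var = residual_se ** 2

# Assumption: components are independent, so their variances add.
systematic_var = total_var - residual_var
systematic_se = math.sqrt(systematic_var)
systematic_share = systematic_var / total_var

print(f"{systematic_se:.1f} points, {systematic_share:.0%} of the variance")
```

Under that assumption, differences among raters, workers, jobs, paces, and their interactions together account for a standard error of about 14 points, or roughly three quarters of the total variance.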

Further research and proce- 
dures are required to help increase the 
consistency and accuracy of pace rating. 
Some specific areas of attention which 
are suggested by this study are discussed 
below. 


Reliability of Raters 


Some raters are more reliable than others. 
More research is needed in the prediction of 
reliability, and also in what it is that causes 
some raters to be reliable. Because of the wide 
differences in reliability shown among the 
raters in this study, the need for testing experi- 
enced time-study men is revealed. More research 
is also needed in methods of training time-study 
men to rate consistently. 


Agreement of Raters 


Some raters rate higher than others, even 
when all raters have the same verbal concept of 
normal pace. Agreement among raters is a major 
problem. It is apparent that time-study men 
need concentrated training in recognizing a 
common concept of normal pace, and frequent 
checks to assure that their concepts have not 
changed. Research is needed into methods of 
objectifying the rating function and into meth- 
ods of securing agreement among raters. 


Averaging of Several Ratings 


Consideration should be given to the fact that 
averages are more accurate and reliable than 
individual ratings. Although it is expensive to 
have more ratings made, it is probably less ex- 
pensive than setting standards which may be 
30 per cent off. 
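
The gain from averaging can be sketched as follows, on the assumption that the ratings being averaged are independent and share the same standard error (for example, ratings drawn from different raters, workers, and occasions). The function name is hypothetical.

```python
import math

def se_of_mean(single_se, n):
    """Standard error of the average of n independent, equally reliable ratings."""
    return single_se / math.sqrt(n)

# Starting from the 16-point total standard error reported in this study.
for n in (1, 2, 4, 9):
    print(n, round(se_of_mean(16.0, n), 1))
```

On this assumption four independent ratings would halve the 16-point error to 8 points. Ratings that share a source of error, such as several ratings by the same rater, would not improve this much, since that rater's own level does not average out.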


Differences in Ratings on 
Various Workers 


Some workers are rated higher than others, 
even when performing the same jobs in the same 
time, and some are rated more reliably. No re- 
search has been done, though a great deal is 
needed, into why some workers are rated dif- 
ferently from others. The reliability and level 
at which different workers are usually rated 
should be known to time-study men. When 
there is a choice, they should have as the opera- 
tor a worker who is rated reliably and at an 
average level. 


Reliability of Different Jobs 


Some jobs are rated more reliably than others. 
Increased reliability may come from knowing 
why. 


Paces Rated Differently 


Normal paces are rated most accurately and 
consistently. When there is a choice, time-study 
men should avoid timing extremely fast or slow 
operators. They should look for an operator 
whose performance is very close to their idea 
of normal. 




Reliability and Accuracy of 
Worker Judgments 

In order for any time-study system to work, 
paces which the time-study men rate as equal 
must seem equal to the workers. Time standards 
are not consistent from job to job unless the 
workers believe the standards are consistent. This 
study shows that this belief is not always held. 

Much more can be done with worker judg- 
ments. They should be used, in some way, as 
the criterion with which to test the accuracy 
and consistency of time-study ratings. They 
should be used to determine adjustments and 
allowances. This study shows that they are reli- 
able enough to be used for these purposes. 

Some of the workers judge more consistently 
than the time-study men rate. This suggests the 
possibility of obtaining ratings from certain 
workers, or of having the time-study men rate 
while they themselves are performing the jobs. 


VI. SUMMARY AND CONCLUSIONS 


In a study designed to determine the 
nature of the errors involved in time- 
study ratings of work pace, six expert 
time-study men made ratings of the 
filmed performances of five workers doing 
each of four jobs at each of five previ- 
ously established paces. The variance of 
the ratings was analyzed to reveal how 
much of the total error was attributable 
to differences among the time-study men 
in their concepts of normal work pace, 
differences in the way each worker and 
each job were rated, and differences in 
the rating errors made at each of the 
paces. Also, the standard errors of each 
group of ratings were compared, as a 
measure of relative reliability. 

While the workers were performing 
the jobs, they evaluated, in relative 
terms, each pace on each job. Analysis of 
variance was applied to these judgments 
to determine the significance of the errors 
involved. Also, the relative level of the 
worker judgments on each job was com- 
pared with the relative level of the time- 
study men’s pace ratings, to determine 
the amount of discrepancy. 




The following conclusions are pre- 
sented: 

1. Pace ratings involve considerable 
error. The total standard error of pace 
ratings from different raters, on different 
jobs, workers, and paces was 16.0 per- 
centage points. The standard error of one 
rater rating one worker on a job at a pace 
was 7.9 points, on the average. 

2. Some raters rate higher than others. 
The highest rater averaged 28 points 
above the lowest rater. 


Fig. 4. Mean and “Range” of Ratings by Each 
Rater of the Average Normal Pace. (The “range” 
includes 95 per cent of the ratings by each rater.) 
[Plot not reproduced; vertical axis: Ratings, horizontal axis: Raters.] 


3. Some raters are more consistent than 
others. The most consistent rater had a 
standard error 5% the size of the standard 
error of the least consistent rater. Figure 
4 shows, for each rater, the spread that 
would include 19 out of 20 ratings of the 
average normal pace. 

4. Some workers are rated higher than 
others, even when all perform the same 
jobs at the same paces. A standard time 
set on the worker who was rated the 
highest would have been 15 per cent 
longer than one set on the worker who 
was rated lowest. 

5. Some workers are rated more re- 
liably than others. The standard error of 
the ratings on the most reliably rated 
worker was 5% the size of the standard 


error of the ratings on the least reliably 
rated worker. 

6. The raters tend to overrate low 
paces and underrate high paces. A stan- 
dard time set on a pace of 90 would have 
been 7 per cent longer than a standard 
time set on a pace of 130. 

7. Normal pace is rated most reliably. 
The standard error increased slightly at 
paces slower or faster than normal. 

8. Some jobs are rated more reliably 
than others. The standard error of the 
ratings on the most reliably rated job was 
3/4 the size of the standard error of the 
ratings on the least reliably rated job. 

9. Interactions are important. Some 
raters rated some jobs higher than others; 
some raters rated some workers higher 
than others; some workers were rated 
higher on some jobs than on others; and 
not all raters followed the pattern of un- 
derrating high paces and overrating low 
ones. 

10. Workers’ judgments on equating 
the jobs differ from the pace ratings of 
the time-study men. The workers be- 
lieved there was as much as a 12 per cent 
per hour differential between two “nor- 
mal” paces as established by time-study 
men. 

11. Individual differences among the 
worker judgments are very important. 
On the same job there was as much as a 
40 per cent difference between two work- 
ers’ judgments as to what was a normal 
pace. 

12. Some workers can judge more re- 
liably than time-study men can rate. The 
standard error of the judgments of the 
two most consistent workers was 3/4 the 
size of the standard error of the best 
time-study men. 

13. A correlation of +.46 exists be- 
tween the workers’ judgments and the 
pace ratings of the time-study men. To 
the extent indicated by this correlation, 
on a job which a worker judged rela- 
tively high, he would be rated relatively 
high; and on a job which a worker 
judged relatively low, he would be rated 
relatively low. 

In conclusion, this study has indicated 
the nature of the errors which are in- 
volved in time-study pace ratings, and 
has indicated some possible approaches 
to the reduction of these errors. 
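
Several of the conclusions above express rating differences as differences in standard time. A minimal sketch of that link follows, assuming the conventional leveling formula in which the observed time is multiplied by the pace rating expressed as a fraction of normal, with allowances ignored. The function name, the observed time, and the two ratings are hypothetical and serve only to show the arithmetic.

```python
def normal_time(observed_minutes, rating):
    """Leveled time: observed time multiplied by rating / 100 (allowances ignored)."""
    return observed_minutes * rating / 100.0

observed = 2.00   # hypothetical observed cycle time, in minutes

# Two workers take the same observed time but receive different ratings.
t_high = normal_time(observed, 115)   # hypothetical rating of the higher-rated worker
t_low = normal_time(observed, 100)    # hypothetical rating of the lower-rated worker

pct_longer = (t_high / t_low - 1.0) * 100.0
print(round(pct_longer, 1))
```

With equal observed times the percentage difference in standard time depends only on the ratio of the two ratings, which is why a difference in rating level between workers carries straight through to the standard.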


REFERENCES 


1. Barnes, R. M. Work measurement manual. 
(3rd Ed.) Dubuque: Wm. C. Brown Com- 
pany, 1947. 

2. Edwards, A. L. Experimental design in psy- 
chological research. New York: Rinehart, 
1950. 

3. Holder, W. B. Applicability of six standard 
tests in the prediction of time study rating 
ability. Unpublished master’s thesis, Pur- 
due Univer., 1949. 

4. Lifson, K. A. Performance rating. Time 
Study Engr, 195%, 6 (June), 179-181. 

5. Lifson, K. A. A psychological approach to 
pace rating. Unpublished doctor's disserta- 
tion, Purdue Univer., 1951. 

6. Lynch, H. R. Rating of time studies. In Pro- 
ceedings of the 5th annual time study and 
methods conference. 1950. 

7. Mundel, M. E. (Ed.) Report of the 5th an- 
nual motion and time study work session. 
Lafayette: Purdue Univer., 1950. (Mimeo- 
graphed) 

8. Mundel, M. E., & Margolin, L. Report of the 
4th annual motion and time study work 
session. Lafayette: Purdue Univer., 1948. 
(Mimeographed) 

9. H. A. A study on effort rating. 
Modern Mgmt, 1949, 9 (April), 19-20. 


(Accepted for publication January 23, 1953.) 


